Efficient top-k SimRank-based similarity join
Similar resources
Fast top-k similarity join for SimRank
SimRank is a well-studied similarity measure between two nodes in a network. However, evaluating SimRank for all pairs of nodes in a network is both time-consuming and impractical, since in many real-world applications users are interested only in the most similar pairs. This paper focuses on top-k similarity join based on SimRank. In this work, we first present an incremental algorithm for com...
Efficient SimRank-based Similarity Join Over Large Graphs
Graphs have been widely used to model complex data in many real-world applications. Answering vertex join queries over large graphs is meaningful and interesting, and can benefit applications such as friend recommendation in social networks and link prediction. In this paper, we adopt “SimRank” to evaluate the similarity of two vertices in a large graph because of its generality. Note that “SimRank” is purel...
Top-k Similarity Join over Multi-valued Objects
Top-k similarity joins have been extensively studied and used in a wide spectrum of applications such as information retrieval, decision making, spatial data analysis and data mining. Given two sets of objects U and V, a top-k similarity join returns the k most similar pairs of objects from U×V. In the conventional model of top-k similarity join processing, an object is usually regarded as a po...
Processing Top-k Join Queries
We consider the problem of efficiently finding the top-k answers for join queries over web-accessible databases. Classical algorithms for finding top-k answers use branch-and-bound techniques to avoid computing scores of all candidates in identifying the top-k answers. To be able to apply such techniques, it is critical to efficiently compute (lower and upper) bounds and expected scores of cand...
Reverse Engineering Top-k Join Queries
Ranked lists have become a fundamental tool to represent the most important items taken from a large collection of data. Search engines, sports leagues and e-commerce platforms present their results, most successful teams and most popular items in a concise and structured way by making use of ranked lists. This paper introduces the PALEO-J framework which is able to reconstruct top-k database q...
Journal
Journal title: Proceedings of the VLDB Endowment
Year: 2014
ISSN: 2150-8097
DOI: 10.14778/2735508.2735520